It has long been considered a significant problem to improve the visualquality of lossy image and video compression. Recent advances in computingpower together with the availability of large training data sets has increasedinterest in the application of deep learning cnns to address image recognitionand image processing tasks. Here, we present a powerful cnn tailored to thespecific task of semantic image understanding to achieve higher visual qualityin lossy compression. A modest increase in complexity is incorporated to theencoder which allows a standard, off-the-shelf jpeg decoder to be used. Whilejpeg encoding may be optimized for generic images, the process is ultimatelyunaware of the specific content of the image to be compressed. Our techniquemakes jpeg content-aware by designing and training a model to identify multiplesemantic regions in a given image. Unlike object detection techniques, ourmodel does not require labeling of object positions and is able to identifyobjects in a single pass. We present a new cnn architecture directedspecifically to image compression, which generates a map that highlightssemantically-salient regions so that they can be encoded at higher quality ascompared to background regions. By adding a complete set of features for everyclass, and then taking a threshold over the sum of all feature activations, wegenerate a map that highlights semantically-salient regions so that they can beencoded at a better quality compared to background regions. Experiments arepresented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset,in which our algorithm achieves higher visual quality for the same compressedsize.
展开▼